The (Un)expected Effects of Applying Standard Cleansing Models to Human Ratings on Compositionality
Authors
Abstract
Human ratings are an important source for evaluating computational models that predict compositionality but, like many data sets of human semantic judgements, are often fraught with uncertainty and noise. Despite their importance, to our knowledge there has been no extensive look at the effects of cleansing methods on human rating data. This paper assesses two standard cleansing approaches on two sets of compositionality ratings for German noun-noun compounds, evaluating their ability to produce ratings of higher consistency while reducing data quantity. We find that (i) our ratings are highly robust against aggressive filtering; (ii) Z-score filtering fails to detect unreliable item ratings; and (iii) Minimum Subject Agreement is highly effective at detecting unreliable subjects.
Related resources
Assessment of the Effects of Economic Sanctions on Iranians’ Right to Health by Using Human Rights Impact Assessment Tool: A Systematic Review
Background: Over the years, economic sanctions have contributed to violations of the right to health in target countries. Iran has been under comprehensive unilateral economic sanctions imposed by groups of countries (not the United Nations [UN]) in recent years. These sanctions have intensified since 2012 because of the international community’s uncertainty about the peaceful purpose of Iran’s nuclear program and inadequacy o...
GhoSt-PV: A Representative Gold Standard of German Particle Verbs
German particle verbs represent a frequent type of multi-word-expression that forms a highly productive paradigm in the lexicon. Similarly to other multi-word expressions, particle verbs exhibit various levels of compositionality. One of the major obstacles for the study of compositionality is the lack of representative gold standards of human ratings. In order to address this bottleneck, this ...
The Effects of Physical, Human and Social Capitals on the Entrepreneurship Level of Economic actors in Shahid Salimi industrial town of Tabriz: Structural Equations and Order Logit Models
The main purpose of this study is to investigate the effects of physical capital, human capital and social capital on the entrepreneurship level of individuals, using a structural equations model and an order logit model in the Shahid Salimi industrial town of Tabriz in 2016. The data were collected from 121 economic activists who were randomly selected from the population. The empirical results show tha...
Detecting Compositionality of Multi-Word Expressions using Nearest Neighbours in Vector Space Models
We present a novel unsupervised approach to detecting the compositionality of multi-word expressions. We compute the compositionality of a phrase through substituting the constituent words with their “neighbours” in a semantic vector space and averaging over the distance between the original phrase and the substituted neighbour phrases. Several methods of obtaining neighbours are presented. The...
Non-Verbal Communication in Models of Communicative Competence and L2 Teachers’ Rating
Non-verbal communication (NVC) plays a major role in various aspects of human life (Andersen, 2004; Cameron, 2001; Johnstone, 2008). Children learning their first language come to realize non-verbal communication as their socialization process takes place (Fletcher & German, 1990; Ingram, 1996; Owens, 2001). However, most EFL learners may have little exposure to these non-verbal aspects of comm...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013